A Comparison of Dependency Parsers for German

Authors

  • Svetlana Smirnova
  • Sandra Kübler
Abstract

This thesis analyzes dependency parsing systems used for parsing German. We restrict ourselves to four parsers run on the same input data: two are data-driven and the other two are grammar-driven. What the parsers have in common is that all of them are based on the same linguistic framework, the dependency theory described in the second chapter of the thesis. In the subsequent chapters we analyze the basic features, behavior, and performance of each parsing system separately; a considerable part of the thesis is dedicated to describing the errors made by the parsers. We then summarize their performance and their advantages and drawbacks relative to each other. One of our tasks is to show ways of improving the performance of each parser. Even the MST parser, which shows the best parsing results (about 90% accuracy), can be improved. For future work, we suggest combining the described parsers as one possible way to achieve better parsing results.

Similar resources

Training Parsers on Partial Trees: A Cross-language Comparison

We present a study that compares data-driven dependency parsers obtained by means of annotation projection between language pairs of varying structural similarity. We show how the partial dependency trees projected from English to Dutch, Italian and German can be exploited to train parsers for the target languages. We evaluate the parsers against manual gold standard annotations and find that t...

Bootstrapping a neural net dependency parser for German using CLARIN resources

Statistical dependency parsers have quickly gained popularity in the last decade by providing a good trade-off between parsing accuracy and parsing speed. Such parsers usually rely on handcrafted symbolic features and linear discriminative classifiers to make attachment choices. Recent work replaces these with dense word embeddings and neural nets with great success for parsing English and Chin...

Making Ellipses Explicit in Dependency Conversion for a German Treebank

We present a carefully designed dependency conversion of the German phrase-structure treebank TiGer that explicitly represents verb ellipses by introducing empty nodes into the tree. Although the conversion process uses heuristics, like many other conversion tools, we designed them to fail if no reasonable solution can be found. The failing of the conversion process makes it possible to detect el...

Why is German Dependency Parsing More Reliable than Constituent Parsing?

In recent years, research in parsing has extended in several new directions. One of these directions is concerned with parsing languages other than English. Treebanks have become available for many European languages, but also for Arabic, Chinese, or Japanese. However, it was shown that parsing results on these treebanks depend on the types of treebank annotations used. Another direction ...

The PaGe 2008 Shared Task on Parsing German

The ACL 2008 Workshop on Parsing German features a shared task on parsing German. The goal of the shared task was to find reasons for the radically different behavior of parsers on the different treebanks and between constituent and dependency representations. In this paper, we describe the task and the data sets. In addition, we provide an overview of the test results and a first analysis.

Publication date: 2006